The zebrafish larva stands out as an emergent model organism for translational studies 
involving gene or drug screening thanks to its size, genetics, and permeability. At the 
larval stage, locomotion occurs in short episodes punctuated by periods of rest. Although 
phenotyping behavior is a key component of large-scale screens, it has not yet been 
automated in this model system. We developed ZebraZoom, a program to automatically 
track larvae and identify maneuvers for many animals performing discrete movements. 
Our program detects each episodic movement and extracts large-scale statistics on motor 
patterns to produce a quantification of the locomotor repertoire. We used ZebraZoom to 
identify motor defects induced by a glycinergic receptor antagonist. The analysis of the 
blind mutant atoh7 revealed small locomotor defects associated with the mutation. Using 
multiclass supervised machine learning, ZebraZoom categorized all episodes of movement 
for each larva into one of three possible maneuvers: slow forward swim, routine turn, and 
escape. ZebraZoom reached 91% accuracy for categorization of stereotypical maneuvers 
that four independent experimenters unanimously identified. For all maneuvers in the data 
set, ZebraZoom agreed with four experimenters in 73.2-82.5% of cases. We modeled 
the series of maneuvers performed by larvae as Markov chains and observed that 
larvae often repeated the same maneuvers within a group. When analyzing subsequent 
maneuvers performed by different larvae, we found that larva-larva interactions occurred 
as series of escapes. Overall, ZebraZoom reached the level of precision found in 
manual analysis but accomplished tasks in a high-throughput format necessary for large 
screens. 



Keywords: machine learning, tracking, analysis of kinematics, collective behavior, support vector machine 
classifier, multiclass categorization, locomotion in intact behaving animals 



A central question in systems neuroscience is how neural circuit
assembly and function relate to animal behavior. Genetic
screens in invertebrate models, such as Drosophila melanogaster
and Caenorhabditis elegans, have begun to unravel the genetic
basis of circuit function and behavior (Chalfie et al., 1985; Moore
et al., 1998; Scholz et al., 2000). Automated methods have recently
been developed in these species to track the position of individuals
alone or in a group (Branson et al., 2009; Swierczek et al.,
2011) and to categorize behavior (Dankert et al., 2009; Kabra
et al., 2013). The zebrafish has emerged as an important vertebrate
model organism for developmental biology, neurobiology,
and human disease models, and is now used as a genetic model
organism for the study of the mechanisms modulating complex
behaviors in vertebrates such as depression and anxiety (Blaser
et al., 2010; Lee et al., 2010; Cachat et al., 2011; Vermoesen et al.,
2011; Zakhary et al., 2011; Ziv et al., 2013), sleep (Zhdanova et al.,
2001; Appelbaum et al., 2009), or addiction (Petzold et al., 2009;
Khor et al., 2011). The permeability, small size, genetic tractability,
transparency, and low cost of zebrafish make them highly
suitable for large-scale genetic and chemical screens (Driever
et al., 1996; Granato et al., 1996; Haffter and Nusslein-Volhard,
1996).

Although simple for a vertebrate, the locomotor patterns of the 
zebrafish larva bring technical challenges to automated analysis. 
Larvae spontaneously swim in discrete bouts in a manner often 
described as "beat and glide," which can be classified as individual 
maneuvers, including slow forward swim, routine turn, or escape. 
These short movements are characterized by a large range of tail- 
beat frequencies (15-100 Hz), which require high-speed imaging 
to capture accurately and can be separated by long resting peri- 
ods of up to a few seconds. Manual tracking via frame-by-frame 
analysis has formed the basis of contemporary knowledge and 
has enabled initial characterization of the larval zebrafish locomotor
repertoire (Budick and O'Malley, 2000; Borla et al., 2002;
McElligott and O'Malley, 2005). However, manual techniques are
both laborious and limited in scope for high-throughput screens
(Driever et al., 1996; Granato et al., 1996; Haffter and Nusslein-Volhard,
1996). The currently available automated tools have
limitations in either refinement or time-scale. Recent chemical or 
genetic screens have relied on commercial software that estimates
an index of mobility of the larvae, usually measured as the distance
traveled during a recording session or the amount of time
spent moving (Rihel et al., 2010; Elbaz et al., 2012; Rihel and
Schier, 2012). These approaches for high-throughput screens provide
information about average velocity and distance traveled
by tracking the animals' center-of-mass over minutes to hours. 
Previous studies have either focused on analyzing movement 
duration and speed at low frequency over long periods of time or 
on fine analysis of kinematics at high frequency but for very short 
acquisition (typically 1000 ms, Burgess and Granato, 2007; Liu 
et al., 2012). Accurate categorization of maneuvers for each indi- 
vidual in a group requires novel methods to record behavior with 
high temporal resolution and over long durations, automatically 
tracking and categorizing thousands of maneuvers. 

Here we developed a new program, ZebraZoom, to track the 
full body position over a multiple-minute timescale of 56 larvae 
simultaneously recorded at high frequency and to finely charac- 
terize each maneuver. To identify core and tail positions for large 
datasets, videos were obtained on multiple larvae simultaneously 
over long periods of time and at high resolution using a high- 
speed camera run in a streaming-to-disk interface (Methods). 
Typically 500-1000 movements from seven larvae were recorded 
per dish in four minutes and eight dishes were monitored in par- 
allel. To simplify tracking, we placed larvae in conditions that 
reduced overlapping in the z-plane during swimming (Methods; 
overlaps occurred on average once every 145 s per larva). We 
developed an offline 2D tracking method for identifying and 
separating each animal even when in close contact (Methods, 



Figure 1). For each larva several features were identified: a core
position that included the head and swim bladder (Figure 1A)
and ten points along the tail (Methods and Figure 1B; Video S1).
As movements occurred as discrete episodes, ZebraZoom 
detected movements based on the tail-bending angle over time 
(Methods and Figures 1C-D). To validate the accuracy of move- 
ment detection, one trained experimenter manually identified all 
movements occurring in a subset of videos. In three videos representing
a total of 189 events, movements were detected with a false
negative rate of 2.7% and a false positive rate of 3.7%.
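As an illustration of detection based on the tail-bending angle, the sketch below computes the angle between the body axis and the core-to-tail-tip line, then groups frames into movements. It is only a minimal sketch under stated assumptions: the function names, the sign convention, and the values of `threshold_deg`, `max_gap` and `min_frames` are hypothetical and are not the criteria used by ZebraZoom.

```python
import math

def tail_bending_angle(core, heading_deg, tail_tip):
    """Signed angle (degrees) between the body axis and the core-to-tail-tip line.

    Assumption: heading_deg points from the core toward the head, so a straight
    tail lies along heading_deg + 180 and gives an angle of 0.
    """
    tail_dir = math.degrees(math.atan2(tail_tip[1] - core[1], tail_tip[0] - core[0]))
    diff = tail_dir - (heading_deg + 180.0)
    # Wrap into [-180, 180).
    return (diff + 180.0) % 360.0 - 180.0

def detect_movements(angles, threshold_deg=5.0, max_gap=5, min_frames=3):
    """Return inclusive (first_frame, last_frame) pairs of detected movements.

    A movement starts when |angle| exceeds threshold_deg and ends once the angle
    has stayed below threshold for more than max_gap frames, so that zero
    crossings during left-right oscillations do not split one movement in two.
    All three parameter values are illustrative assumptions.
    """
    bouts = []
    start = last_active = None
    for i, angle in enumerate(angles):
        if abs(angle) > threshold_deg:
            if start is None:
                start = i
            last_active = i
        elif start is not None and i - last_active > max_gap:
            if last_active - start + 1 >= min_frames:
                bouts.append((start, last_active))
            start = None
    # Close a movement that is still open at the end of the recording.
    if start is not None and last_active - start + 1 >= min_frames:
        bouts.append((start, last_active))
    return bouts
```

Tolerating short sub-threshold gaps is one simple way to keep an oscillating tail trace within a single episode; other criteria, such as smoothing the angle first, would serve the same purpose.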

To quantify movements in a consistent manner, we used the 
location of the head, the position of the tail, the heading direction 
and the tail-bending angle to estimate global parameters of loco- 
motion (Figure 2A, Methods). We observed that movements for 
5-7 dpf wild-type (WT) larvae occurred every 2.22 s on average 
per larva (at 0.4495 ± 0.0117 Hz). For all movements identified, 
larvae performed on average 3.19 ± 0.01 oscillations per move- 
ment, had a 24.29 ± 0.03 Hz tail-beat frequency (TBF), lasting 
189.5 ± 0.0004 ms with a 51.14 ± 0.18° heading direction range, 
2.49 ± 0.008 mm traveled distance and 13.35 ± 0.04 mm/s speed 
per maneuver. We illustrated the use of ZebraZoom for quanti- 
fying the effects of a known glycinergic receptor antagonist, and 
for analyzing a blind genetic mutant. Glycine is responsible for 
reciprocal inhibition in the spinal cord that permits left-right 
alternation to sustain oscillations (Dale, 1985; Grillner et al., 
1995; Granato et al., 1996; Drapeau et al., 2002; Li et al., 2004).
In zebrafish, mutants for glycinergic receptors or transporters
have been associated with defects in motor pattern generation
(Granato et al., 1996; Odenthal et al., 1996; Hirata et al., 2005;
Masino and Fetcho, 2005).

FIGURE 1 | Image processing for tracking of larvae's core positions and
larvae's tail and detection of movements based on the tail-bending
angle. (A) Tracking of the larvae's core positions. (Ai) Initial image.
(Aii) Background image. (Aiii) Image with background subtracted.
(Aiv) Binary image. (Av) Eroded image. (Avi) For each larva, identification
of the core (blue dot) and heading direction (red axis). (B) Identifying the
tip of the tail. (Bi) The head center is located at the boundary of the head
and trunk. Candidate points 1-4 along the tail are the four points of the
contour with the smallest x-value, smallest y-value, largest x-value, and
largest y-value caudal to reference points 1 and 2. (Bii) The two distances
d1 and d2 shown for candidate point 1. (Biii) The two vectors used to
identify the tail tip defined with the minimal scalar product for candidate
point 1. (C) Definition of the tail-bending angle (α) separating the body axis
(pink) and the line connecting the core and the tip of the tail (green).
(D) Example of the tail-bending angle over time with detection of
movements indicated by the pink line.

FIGURE 2 | Global parameters describing locomotion in wild-type,
mutant, or drug-treated larvae. (A) Distribution of global parameters
of movements for 5-7 dpf WT larvae. From left to right: Number of
oscillations per movement, TBF in Hz across all movements, duration
of each movement in ms, heading direction range in degrees,
distance traveled per movement in mm, speed in mm/s during each
movement (eight videos, 420 larvae, six clutches, 5-7 dpf, 44,688
movements). All values were calculated per movement. (B) Effect of
the glycinergic receptor antagonist strychnine on the global parameters
of movements. White circles are before application, gray are after
application (two videos, 42 larvae for each condition, two clutches,
6-7 dpf, 10,459 movements). (C) Effect of the atoh7 mutation on the
global parameters characterizing movements (four videos, 112 mutants
atoh7−/−, and 112 control siblings, four clutches, 6 dpf). For (B,C):
error bars are standard errors of the mean and statistics were
calculated per larva.

We measured the effect of bath application
of 75 µM strychnine on spontaneous locomotor activity
in larvae and compared to control siblings that were not exposed 
to the drug (Figure 2B, Methods). For control larvae, we did not 
observe a significant change in the occurrence of movements over 
time (0.35 ± 0.05 movements per larva/s before and 0.27 ± 0.03 
movements per larva/s after), or in any of the global parameters
(Figure 2B; all p > 0.15). However, the locomotor behavior
of larvae treated with strychnine was significantly impacted
(Figure 2B). Overall, movements occurred less frequently (0.30
± 0.04 Hz before and 0.12 ± 0.02 Hz after, p < 0.0002). Although
the average TBF during a movement did not change (p > 0.81), 
the number of oscillations decreased (3.52 ± 0.17 before and 2.89 
± 0.16 after; p < 0.0078), an effect that was associated with a 



decrease in movement duration (p < 0.0001), distance traveled
(p < 10⁻⁵), and average speed (p < 10⁻⁵). Strychnine application
also resulted in a decrease in the range of heading direction
(p < 10⁻⁵). atoh7 mutant larvae lack retinal ganglion cells, rendering
them blind (atoh7−/−, Kay et al., 2001). Considering
the importance of vision for zebrafish larvae, analyzing their
locomotor output could reveal corresponding behavioral differences.
Overall, atoh7−/− mutants generated episodic movements
less frequently than control siblings (0.33 ± 0.02 Hz vs. 0.51 ±
0.02 Hz, 112 larvae for each condition). Quantitative analysis of
global parameters of the blind mutants showed no difference
in the average TBF or the average speed per larva (Figure 2C;
p > 0.85 and p > 0.83, respectively) but there were small but significant
decreases in the number of oscillations, duration, heading
direction range, and distance traveled (Figure 2C; all p < 10⁻³).
These defects were observed systematically in four clutches.
atoh7−/− mutants thus display small but significant differences
in basic motor behavior when compared to control siblings.

Zebrafish larvae display a variety of locomotor maneuvers that 
are often grouped into discrete categories. In these experimental 
conditions, three types of movement occur in groups of larvae 
at early stages: slow forward swims (S), routine turns (T, also 
referred to as slow turns), and escapes (E, including C-turns 
or burst swims). Figure 3 shows examples of these movements 
reported by ZebraZoom. For each maneuver, we superimposed 
a succession of images (Figures 3Ai-Ci), the tail-bending angle 
over time (Figures 3Aii-Cii) and the curvature along the rostro-caudal
axis as a function of time (Figures 3Aiii-Ciii). The
three types of maneuvers included a series of slow left-right alternations;
high values of curvature were confined to the caudal
tail (Figures 3Aiii-Ciii). While high values of curvature of the
tail were confined to the caudal end for slow forward swims
(Figure 3Aiii), high values of curvature were distributed from
head to tail for routine turns and escapes (Figures 3Biii,Ciii).
Stereotypical routine turns and escapes differed by the frequency
of left-right alternation in the tail bend (Figures 3Biii,Ciii). As
larvae did not always exhibit a canonical slow forward swim, 
routine turn or escape, some movements were ambiguous. To 
estimate the percentage of these movements, four experimenters 
subjectively classified 390 movements distributed over eight 



videos. Overall, about 82% of all movements were classified uniformly
by at least three out of four experimenters (Methods),
indicating that 18% of movements were difficult to categorize.

Using knowledge of stereotypical locomotor events, we 
designed a multiclass categorization approach with supervised 
machine learning to automatically sort each movement into one 
of the three categories. To implement the multiclass categoriza- 
tion, we used two successive support vector machine (SVM) 
classifiers: the first classifier sorted S vs. all other maneuvers, and 
when necessary the second classifier sorted T vs. E. Locomotor 
events were segregated subjectively in the training set (n = 201). 
This machine learning approach relied on associating dynamic 
parameters extracted from the tail-bending angle over time with 
each maneuver type identified in the training set (Figure 4A 
and Methods).

Table 1 | Estimation of ZebraZoom categorizing accuracy based on
the different reference experimenters.

FIGURE 3 | Typical maneuvers occurring in groups of 5-6 dpf larvae.
(A) Slow forward swim (S). (B) Routine turn (T). (C) Escape response
(E). (Ai-Ci) Superimposed images taken every 17 ms. (Aii-Cii) The
tail-bending angle over time for each maneuver. (Aiii-Ciii) Plots of the
curvature of the tail as a function of time and position along the rostro-caudal
body axis.

To reduce the dimensionality of the data, we
performed Principal Component Analysis (PCA). With the classifiers
trained on the maneuvers selected by a trained experimenter in the
learning set, we validated the multiclass categorization by comparing
it with the subjective classification performed by a trained
experimenter on a recognition set (n = 189; Figures 4B,C). We
observed that ZebraZoom agreed with the trained experimenter
82.5% of the time for the recognition dataset (85% for S, 82%
for T, and 79% for E; Figures 4B,C; Table 1 and Methods).
When compared to four independent experimenters, ZebraZoom
reached 91% accuracy for categorization of stereotypical maneuvers
that all experimenters had unanimously identified and
76.4% on average for all maneuvers (73.2-82.5%, Table 1).
Once validated, we applied the ZebraZoom categorization algorithm
to a large dataset of 44,688 movements of WT larvae
(Figure 4D). We identified 14,911 S (33.36%), 21,432 T (47.96%),
and 8,345 E (18.67%). The distributions of global parameters for
the three classes of maneuvers were similar in terms of number
of oscillations and duration, but they differed in terms of
mean TBF, heading direction range, distance traveled and speed
(Figure 4D).
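The cascade of two binary classifiers described above (S vs. all other maneuvers, then T vs. E) can be sketched as follows. This is a toy illustration, not the trained ZebraZoom classifiers: `linear_decision` merely stands in for a trained SVM decision function, and the feature vector, weights and biases shown in the usage lines are hypothetical values chosen by hand.

```python
def linear_decision(weights, bias):
    """Build a binary decision function w.x + b > 0.

    Stand-in for a trained SVM; in practice the weights would be learned from
    the labeled training set rather than set by hand.
    """
    def decide(features):
        return sum(w * x for w, x in zip(weights, features)) + bias > 0
    return decide

def categorize(features, is_slow_swim, is_routine_turn):
    """Two-stage cascade: first S vs. rest, then T vs. E for the rest."""
    if is_slow_swim(features):
        return "S"
    return "T" if is_routine_turn(features) else "E"

# Hypothetical two-component feature vectors (e.g., two principal components).
is_slow_swim = linear_decision([-1.0, 0.0], 1.0)     # true when component 0 < 1
is_routine_turn = linear_decision([0.0, -1.0], 2.0)  # true when component 1 < 2
labels = [categorize(f, is_slow_swim, is_routine_turn)
          for f in ([0.5, 5.0], [3.0, 1.0], [3.0, 4.0])]
```

The second classifier is only consulted when the first one rejects the slow forward swim category, which is what "when necessary" refers to in the text.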

The investigation of interactions between individuals lead- 
ing to coordinated motion in animal groups has been a long- 
standing challenge that is central to elucidating the mecha- 
nisms and evolution of collective behavior. Most studies have 
focused on the analysis of speed or directionality to reflect the 
interaction between animals (Katz et al., 2011; Gautrais et al., 
2012). We took advantage of ZebraZoom's ability to accurately
identify each larva and categorize its maneuvers to
study how larvae interacted. In comparison to juvenile and adult
zebrafish that swim continuously, larval zebrafish swim episod- 
ically with maneuvers that occur in a beat-and-glide manner. 
Each movement can be regarded as a discrete event; we
were therefore motivated to explore how local perturbations of a single
individual could impact the group. The program switched
identity of larvae once every 109 s (once every 49 movements 
on average), allowing us to track single larvae. We modeled 
sequences of maneuvers performed by larvae within a group as 
Markov chains. Utilizing the classifier, we described larva-larva 
interactions in a group and intrinsic properties of individuals.
We calculated a transition index (I) for each sequence of
two maneuvers as the transition probability between first and
second maneuvers divided by the probability of random occurrence
of the second maneuver (Figure 5; Table 2 and Methods).
When the two successive maneuvers were the same, a transition
index above one indicated that the probability of repetition of this
maneuver was greater than chance. The transition index was
equal to one when the order of sequential maneuvers was random.
Overall, I was greater than one for repetition of the
same maneuvers (Table 2). We sorted the data into interactions
between different animals and the repetition within the same 
animal. We analyzed how the transition index for a given suc- 
cession of maneuvers depended on the distance between the 
two larvae's core positions at the onset of the movement and 
the time between the onset of each movement (Figure 5 and 
Methods). Individual larvae often performed the same type
of maneuver sequentially (maximal values I_S(same) = 1.43,
I_T(same) = 1.37, I_E(same) = 2.38, all p < 0.002; Figure 5A,
Table 2, and Methods). Although slow forward swims or routine
turns were not frequently repeated between larvae (I close
to 1: maximal values I_S(diff) = 1.09 and I_T(diff) = 1.01, p >
0.05; Figures 5Bi,ii, Table 2, and Methods), we found that recurrent
escapes were very frequent between different larvae (maximal
value I_E(diff) = 3.6, p < 0.002; Figure 5Biii, Table 2, and
Methods). Five to seven dpf larvae do not show evidence for
social interactions (Buske and Gerlai, 2011). By taking advantage
of the algorithm for identifying single larvae and categorizing
simple maneuvers, we revealed that larva-larva interactions primarily
occurred for escape responses. These series of escapes
occurred after direct collisions (in one third of the cases) or via
long-distance interaction (two thirds of cases). Blind atoh7−/−
larvae showed a similar profile of interactions for escapes (data
not shown); these interactions were most likely mechanically
triggered.
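To make the transition index concrete, the sketch below computes it for a single sequence of categorized maneuvers. It assumes that the probability of random occurrence of the second maneuver is estimated by its frequency among all second positions; the function name is hypothetical, and the binning by inter-larva distance and inter-movement delay used in Figure 5 is left out.

```python
from collections import Counter

def transition_index(sequence):
    """Map (first, second) -> P(second | first) / P(second).

    sequence is an ordered list of maneuver labels such as "S", "T", "E".
    Only pairs that actually occur in the sequence appear in the result.
    """
    pairs = Counter(zip(sequence, sequence[1:]))
    firsts = Counter(sequence[:-1])
    seconds = Counter(sequence[1:])
    n_pairs = len(sequence) - 1
    index = {}
    for (m1, m2), count in pairs.items():
        p_transition = count / firsts[m1]   # P(second = m2 | first = m1)
        p_second = seconds[m2] / n_pairs    # chance level for m2
        index[(m1, m2)] = p_transition / p_second
    return index

# Toy sequence in which each maneuver tends to be repeated.
toy_index = transition_index(list("SSSSTTTT"))
```

In this toy sequence the repeated pairs S-S and T-T both give an index of 1.75, above the chance level of one, while the single S-T switch gives 0.4375.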

Large-scale chemical and genetic screens would benefit from 
a quantitative approach to analyze fine locomotor patterns 
over long periods of time. Compared to other genetic models, 
zebrafish locomotion is difficult to analyze because larvae initi- 
ate maneuvers intermittently and during these short events, the 
larvae swim at a high speed with TBFs ranging from 15-100 Hz. 
The quantitative analysis of motor behavior for large-scale screens 
requires solving the problem of recording multiple animals simul- 
taneously at high frequency (above 200 Hz) and for long periods 
of time (minutes). Here we implemented a reliable method for 
quantifying global parameters of movements based on stream-to- 
disk recordings acquired at high frequency and over long periods 
of time, limited only by data storage. Next we developed a robust 
method for tracking the full body position of zebrafish larvae 
swimming in groups. We first manually validated that the track- 
ing accurately detected discrete movements, and then used the 
global parameters obtained to characterize the locomotion of 
WT larvae. Quantification of the global parameters describing 
larval movements corroborates previous observations based on 
fewer samples (Budick and O'Malley, 2000; Danos and Lauder, 
2007; Liu et al., 2012). Similar estimates of the duration of 
movements, distance traveled and speed were obtained from the 
recent application of C-trax (designed originally for Drosophila) 
to zebrafish larvae [Lambert et al. (2012) based on Branson 
et al. (2009)]. In these conditions, recordings at low frequency 
over long periods of time, typically 60 Hz for minutes or hours, 
revealed the global level of activity over time but no informa- 
tion on fine kinematics during individual maneuvers (Elbaz et al., 
2012). When recordings were performed at high frequency to cap- 
ture the dynamics of motion, they usually lasted 1000 ms (Burgess 
and Granato, 2007). 

We illustrated the benefit of ZebraZoom to quantify global 
parameters of movements by analyzing the effect of a drug to 
block glycinergic neurotransmission, which has been known to 
be involved in motor pattern generation and alternation between 
the left and right side of the spinal cord across vertebrate species 
(Grillner, 2003; Korn and Faber, 2005; Nishimaru and Kakizaki, 
2009). Most studies relied on ventral nerve root recordings where 
muscles were dissected out or paralyzed in order to record the 
activity of motor neurons at the level of a few segments at 
most. Our automated quantification of locomotor events enabled
identification of effects induced by bath application of the glycinergic
antagonist strychnine on locomotion in intact animals. As
predicted, bath application of strychnine dramatically reduced
the occurrence of movements and the number of oscillations per
movement, which was correlated with a reduction of the duration
of movement and of the distance traveled. While mean TBF
was not affected, we observed a reduction in the heading direction
range and in speed. Our approach pinpointed effects of
glycinergic blockade, including a reduction in the number of
oscillations per movement, a kinematic feature not estimated in
commercially available software. The analysis of the mutant atoh7
revealed that although TBF and speed were not affected in the
blind mutant, there was a small but significant decrease in the
number of oscillations, heading direction range, distance, and
duration of each bout compared to their control siblings. These
effects were systematically observed in four clutches, suggesting
that visual feedback may impact some global parameters of locomotion.
However, since the pattern of expression of atoh7 has not
yet been fully characterized, it cannot be excluded that the gene
may be expressed in cells other than retinal ganglion cells.

FIGURE 4 | Validation of the automated categorization of maneuvers:
slow forward swim (S), routine turn (T) and escape (E). (A) Dynamic
parameters used for categorizing the different maneuvers: amplitude of
tail-bending angle (TBA) in degrees, integrated TBA in degrees, TBF in Hz,
and speed in mm/s. The mean of each parameter for each time bin is shown
and error bars are standard error of the mean: S in pink, T in green and E in
blue. Time 0 is taken at the peak of the first bend of the movement.
(B,C) Comparison of the results of the automatic categorization from
ZebraZoom with the subjective categorization by a trained experimenter on
189 movements from one video. The comparison of the categorization is
shown overall (B) and for each maneuver (C: Ci for S, Cii for T, Ciii for E). The
proportion of movements categorized the same way by both methods is
shown in addition to the proportion of movements miscategorized and how
they were categorized. (D) Distribution of global parameters for each
maneuver S, T and E of WT larvae (same color code as in A; 44,688
movements total from eight videos, six clutches).

FIGURE 5 | Larva-larva interactions occurred most frequently as
sequences of escapes. Transition index for the same larvae (A) and
for different larvae (B). Time is plotted in seconds from the time of
initiation of the first movement. Distance is plotted from the core
position of the larva at the beginning of a movement. (Ai,Bi) The
sequence S-S. (Aii,Bii) The sequence T-T. (Aiii,Biii) The sequence
E-E (36,068 total movements from five videos, 280 WT larvae, four
clutches, 5-6 dpf).

Table 2 | Transition index for sequence of two maneuvers estimated
as the probability of transition from maneuver 1 to maneuver 2
divided by the occurrence of the maneuver 2.

                S         T         E
INTERACTIONS BETWEEN DIFFERENT ANIMALS
S            1.0498    1.0152    0.8481
T            1.0087    1.0413    0.8606
E            0.8549    0.8493    1.7528
REPETITIONS WITHIN THE SAME ANIMAL
S            1.2162    0.8947    0.8413
T            0.9091    1.1185    0.8499
E            0.8263    0.8887    1.6995

The originality of ZebraZoom lies in categorizing all maneu- 
vers performed by individual larvae in a group. The subjective 
analysis of maneuvers based on four independent experimenters 
revealed that locomotor maneuvers were not obvious to catego- 
rize. Based on subjective estimates, 18% of all movements corre- 
sponded to ambiguous maneuvers. By using a machine-learning 
paradigm, we trained ZebraZoom to categorize all maneuvers 
over tens of thousands of movements with 82.5% accuracy, a value similar
to the 72% agreement rate of all four experimenters
measured over a few hundred movements. The approach we
developed here could be expanded to include directionality of the 
turns, sequences of maneuvers such as those occurring during 
prey tracking, and subcategories of escapes. 

This study constitutes an important first step for accurate 
tracking of multiple larvae in groups over long periods of time 
and for categorizing maneuvers. Some improvements could be 
implemented in the future. While our tracking method currently 
relies on a simple "blob" approach solely based on raw image 
analysis, a model-based approach may be more reliable in partic- 
ular when animals are in close contact (Fontaine et al., 2008). We 
show here that ZebraZoom can achieve an accurate categoriza- 
tion of maneuvers, comparable to experimenters' estimates, based 
solely on the dynamics of movement of head and tail. An interesting
avenue for future work would be to investigate
novel dynamic parameters for the learning and recognition
process of the classifier to yield subtler methods for detection of
defects. Quantification of motor patterns in C. elegans is based
on a description of all possible positions of the animal over 
time (Stephens et al., 2008). In order to fully understand larval
zebrafish behavior we need to identify a minimal set of parameters
sufficient to describe all motor patterns. Altogether, this
work brings new insight to the complexity of behavior determination
in zebrafish larvae and could be applied to investigation of
the mechanisms of addiction, arousal, feeding, social interaction
and aggression in larvae and juveniles (Gahtan et al., 2005; Bianco
et al., 2011; Buske and Gerlai, 2011; Miller and Gerlai, 2012; Ziv
et al., 2013). The observation of complex interactions in juveniles
raises the hope that it will soon be possible to investigate the
neuronal circuits and molecular pathways underlying social interactions.
The fact that we can track individual larvae and analyze
their interactions is a major advance over existing methods. Our 
approach that systematically quantifies and categorizes thousands 
of motor patterns was designed to bring efficiency and reliability 
to drug screening and forward genetic screens. ZebraZoom can 
detect, quantify, and categorize movements to provide a quan- 
titative description of global parameters as well as a qualitative 
description of all maneuvers performed by individual larvae. 

METHODS 

ZEBRAFISH HUSBANDRY 

All experiments were performed on Danio rerio larvae between 5 
and 7 dpf. AB and TL strains of WT larvae were obtained from 
our laboratory stock of adults. Embryos and larvae were raised 
in an incubator at 28.5°C under a 14/10 light/dark cycle (lights 
on, 8:00 A.M.; lights off, 10:00 P.M.) until the start of behavioral 
recordings. The mutant line for atoh7 (Kay et al., 2001) was provided
by Dr. Herwig Baier, MPI Munich. Homozygous atoh7−/−
mutants were identified at 5 dpf by their dark pigmentation. All
procedures were approved by the Institutional Ethics Committee
at the Research Center of the Institut du Cerveau et de la Moelle
épinière (CRICM).

BEHAVIORAL RECORDINGS 

Motor behavior of 56 larvae split into eight dishes (seven
larvae per dish, Figure 1Ai) was recorded on a homogeneous illumination
plate (light intensity 0.78 mW/cm², Phlox, ref.
LEDW-BL-200/200-LLUB-Q-1R24) in egg water
(http://zfin.org/zf_info/zfbook/chapt1/1.3.html, methylene blue added at
0.5 ppm). Following acclimation, larvae were recorded for 4 min
at 337 Hz with a high-speed camera (VC-2MC-M340E0-C,
CMOS chip 2048 x 1088 pixels, Vieworks, South Korea) placed
above the setup and coupled to a camera objective (AF Nikkor
50 mm f/1.8D, Nikon, Japan). Pixel size was 66 µm. We developed
a direct-to-disk high-speed imaging system designed for long
acquisitions of raw images in collaboration with R&D Vision,
France. Behavioral recordings were performed between 2:00
and 5:00 P.M. Larvae were acclimated for 60 minutes on the
light source at room temperature (21-22°C) and kept at room
temperature during all recordings. Larvae were kept in dishes
with an inner diameter of 2.2 cm and an outer diameter of 3.5 cm
(Figure 1A). Water was kept at a low level (2 mm) in order to
reduce the occurrence of crossings between larvae. Typically
500-1000 movements were recorded in each 4-min session for
each well.

ZEBRAZOOM TRACKING ALGORITHM 

The first step is to track the core and then the tail for all larvae 
over time. Written in C++ using the OpenCV library, the 
program identified the center position and heading direction of 
each larva (Figures 1Ai-vi). The algorithm used a Hough transform 
to identify the eight wells. For each well the background 
was estimated as the maximum pixel value over all frames of the 
video recording (Figure 1Aii) and then subtracted for all frames 
for that well (Figure 1Aiii). The resulting image was converted 
to binary (Figure 1Aiv). An erosion filter was applied twice in a 
row with a 3 by 3 structuring element (Figure 1Av). The "core" of 
the larva referred to the resulting connected components that had 
an appropriate area (between 0.0871 and 0.8712 mm²). The core 
of the larva included the head and the trunk with swim bladder 
(Figure 1Avi). The algorithm identified the head center position 
as the center of mass of the putative cores for each larva in a frame. 
To follow each larva across subsequent frames, ZebraZoom used 
the information from the previous two frames (core position and 
speed) to predict the position of the larva and located the closest 
core out of all the possible cores. The heading direction for each 
larva was calculated simultaneously using the moments of the 
eroded body (up to the second order, see red lines in Figure 1Avi). 
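
The frame-to-frame association and the moment-based heading described above can be illustrated with a minimal Python sketch. It is not the ZebraZoom C++ code: the function names, the constant-velocity extrapolation, and the orientation formula based on second-order central moments are assumptions chosen for illustration, and the coordinates are made-up pixel values.

```python
import math

def predict_position(prev2, prev1):
    """Extrapolate the next core position assuming constant velocity."""
    return (2 * prev1[0] - prev2[0], 2 * prev1[1] - prev2[1])

def closest_core(predicted, candidate_cores):
    """Index of the candidate core nearest to the predicted position."""
    return min(range(len(candidate_cores)),
               key=lambda i: math.dist(predicted, candidate_cores[i]))

def axis_orientation(pixels):
    """Orientation (radians) of the principal axis from second-order central moments."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)

# Hypothetical data: two previous core positions and three cores found in the new frame.
predicted = predict_position((10.0, 10.0), (12.0, 11.0))
cores = [(40.0, 5.0), (14.5, 12.2), (11.0, 30.0)]
match = closest_core(predicted, cores)

# Pixels of a toy "eroded body" lying on a diagonal line.
angle = axis_orientation([(0, 0), (1, 1), (2, 2)])
```

The second-order moments define an axis only up to a 180° ambiguity, so a separate cue would be needed in practice to tell head from tail along that axis.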

For each larva with an identified core, we determined the "full 
body" referring to the connected component of the binary image 
in Figure 1Aiv. In order to track the tail, the full body was rotated 
so that the head axis was parallel to the y-axis, always in the same 
orientation. To identify the contour of the tail in the coordinate 
system defined by the head axis, a series of points was extracted 
from the full body by using the algorithm of Suzuki and Abe 
(1985) (white dots in Figures 1Bi-iii). Reference point A1 was 
the closest point on the contour line from the head center and 
reference point A2 was the point symmetrical along the head axis 
to reference point A1 on the contour. In order to identify the tip of 
the tail, four candidate points on the contour were selected with 
minimal and maximal x- and y-values (Figures 1Bi-iii). For the 
maximal y-value the point also had to be above a given distance 
away from the two reference points [below a 20% threshold for 
the ratio |d1 - d2|/(d1 + d2), Figure 1Bii]. Distances d1 and d2 
were calculated from each candidate point to the reference points 
A1 and A2 along the contour (Figure 1Bii). Candidate points 
with a ratio |d1 - d2|/(d1 + d2) over 0.25 were excluded. The 
tip of the tail was then identified as the point associated with the 
smallest scalar product of the tangential vectors pointing in opposite 
directions (Figure 1Biii). The midline of the larva was defined 
as the line equidistant to the contour line on the left and right side. 
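
The exclusion rule for tail-tip candidates can be sketched as follows. This is a hypothetical helper, not the original implementation: the function names are invented, the distances are arbitrary contour lengths, and the cutoff is left as a parameter because the text quotes both a 20% threshold and a 0.25 cutoff.

```python
def asymmetry_ratio(d1, d2):
    """Ratio |d1 - d2| / (d1 + d2) of the contour distances to A1 and A2."""
    return abs(d1 - d2) / (d1 + d2)

def keep_candidate(d1, d2, max_ratio=0.25):
    """A candidate is kept when it is roughly equidistant from both reference points."""
    return asymmetry_ratio(d1, d2) <= max_ratio

# Hypothetical contour distances (in pixels) for two candidate points.
balanced = keep_candidate(100.0, 80.0)    # ratio = 20 / 180
lopsided = keep_candidate(100.0, 40.0)    # ratio = 60 / 140
```

A point near the true tail tip sits about halfway around the contour from both reference points, which is why a small ratio is a reasonable plausibility check.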

ERRORS IN CORE AND TAIL TRACKING 

If an error occurred in the core tracking, the larva was missing 
for that frame and there was no tracking of its tail. If the core of 
a larva was identified, the algorithm proceeded to the tail track- 
ing. To confirm that the tail tracking was correct, the algorithm 
checked that the tail length was greater than 1.32 and less than 
3.96 mm. If this criterion was invalid, the tail position was set 
to the previous frame. This happened in 13.46% of frames on 
average but was compensated by a smoothing spline on the cen- 
ter positions between the left and right contour points of the tail 
and a median filter applied on the tail-bending angle over time. 
The tail-bending angle was defined as the angle between the axis 
formed by the tip of the tail and the center of the head with respect 
to the larva heading direction (Figures 1C,D). 
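
A minimal sketch of the plausibility check and temporal filtering described above is given below. It is an illustration only: it carries forward the tail-bending angle rather than the full tail position, and the median-filter window of three frames is an assumed value that is not stated in the text.

```python
from statistics import median

MIN_TAIL_MM, MAX_TAIL_MM = 1.32, 3.96  # accepted tail length range from the text

def carry_forward_invalid(tail_lengths_mm, angles_deg):
    """Reuse the previous angle whenever the tail length is out of range."""
    cleaned, last = [], None
    for length, angle in zip(tail_lengths_mm, angles_deg):
        # The first frame is accepted as is because no previous frame exists.
        if MIN_TAIL_MM < length < MAX_TAIL_MM or last is None:
            last = angle
        cleaned.append(last)
    return cleaned

def median_filter(values, window=3):
    """Sliding median with a window truncated at both ends of the series."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

# Hypothetical series: the third frame has an implausible tail length (5.0 mm).
lengths = [2.5, 2.6, 5.0, 2.4, 2.5]
angles = [0.0, 1.0, 40.0, 2.0, 3.0]
cleaned = carry_forward_invalid(lengths, angles)
smoothed = median_filter(cleaned, window=3)
```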

SEPARATING LARVAE DURING CONTACTS 

Tracking was optimized to separate larvae in close vicinity to 
one another or in direct contact. For core tracking, if the tra- 
jectories of the two cores merged at a given time point, then 
the algorithm considered that a collision occurred between the 
two larvae. When the predicted positions of two larvae based on 
core position and speed in the two previous frames were clos- 
est to the same core, the algorithm considered that a collision 
between the two larvae occurred at that frame. When a collision 
was detected, the algorithm applied erosion filters in the region 
of interest defined by the core until more than one isolated core 
emerged. In rare cases, the multiple cores were not resolved and 
the larva could not be tracked for that frame. For the tail tracking, 
if the area of the larva's full body was greater than 1.9 mm², 
the algorithm considered that two larvae were in direct contact. 
The distance separating the larvae's cores determined which of 
two algorithms was used to isolate the tails: if the distance was 
less than 1.32 mm, a line separation algorithm was applied. A line 
was created to separate the two larvae by optimizing the area of 
the resulting tails, calculated by maximizing the sum of the two 
largest areas containing a head center position. If the distance 
was greater than 1.32 mm, a pixel intensity separation algorithm 
was applied instead. The threshold used to convert the image 
into binary was adjusted until two separate full bodies, each a 
connected component, emerged and contained the head center 
position. Larvae crossings occurred once every 145 s on average 
per larva (0.0069 ± 0.0019 events per second) and the switching 
of identification between two larvae after a collision was esti- 
mated manually to occur every 109 s on average (0.0092 ± 0.0036 
events per second per larva, based on 720 s of recordings from four 
videos and 28 larvae). 
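
The choice between the two tail-separation algorithms can be summarized by a small decision function. The sketch below is hypothetical: the function name and return labels are invented, and a distance of exactly 1.32 mm, which the text does not address, is assigned to the pixel intensity branch by assumption.

```python
def separation_strategy(full_body_area_mm2, core_distance_mm,
                        contact_area_mm2=1.9, distance_threshold_mm=1.32):
    """Pick the tail-separation method from body area and core-to-core distance."""
    if full_body_area_mm2 <= contact_area_mm2:
        return "none"              # a single larva, nothing to separate
    if core_distance_mm < distance_threshold_mm:
        return "line"              # cores are close: line separation
    return "pixel_intensity"       # cores are farther apart: threshold adjustment

# Hypothetical measurements for three frames.
single = separation_strategy(1.2, 5.0)
close_contact = separation_strategy(2.4, 0.9)
distant_contact = separation_strategy(2.4, 2.0)
```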

DETECTION OF MOVEMENTS 

Algorithms for the detection of movements and the behavior 
analysis were written in MATLAB (The Mathworks, Inc., USA). 
The detection of movement was based solely on threshold- 
ing the tail-bending angle measured over time (Figures 1C,D). 
ZebraZoom detected the start of a movement when the value for 
the tail-bending angle at a given frame varied over 1.15° from 
the mean value of the tail-bending angle for the ten surrounding 
frames, or 29.7 ms. To avoid separating single maneuvers 
into multiple events, movements that occurred within 14.8 ms of 
each other were merged. To avoid false positives we considered 
only movements in which the larva core had moved more than 
0.099 mm and where the range of tail-bending angle values was 
above 2.86°. Additionally, only events during which the 
eroded binary image of the larva had moved more than a set number 
of pixels between subsequent images were considered based 
on the parameters used for the erosion. Rarely, we observed 
two distinct movements occurring without a pause, such as a slow 
forward swim followed by an escape due to a collision. In these 
few cases when two movements occurred without a noticeable 
stabilization in the tail-bending angle over time, the movements 
were merged into one movement in our analysis. 
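
The thresholding logic can be sketched in Python as follows. This is not the original MATLAB code. It assumes that the "ten surrounding frames" are the five frames before and the five frames after the current one, and that 14.8 ms corresponds to five frames at 337 Hz. All function names are hypothetical.

```python
def active_frames(angles_deg, threshold_deg=1.15, half_window=5):
    """Flag frames whose angle deviates from the mean of the surrounding frames."""
    n = len(angles_deg)
    flags = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        neighbours = [angles_deg[j] for j in range(lo, hi) if j != i]
        local_mean = sum(neighbours) / len(neighbours)
        flags.append(abs(angles_deg[i] - local_mean) > threshold_deg)
    return flags

def group_events(flags, max_gap_frames=5):
    """Group flagged frames into (start, end) events, merging across short gaps."""
    events = []
    for i, flag in enumerate(flags):
        if not flag:
            continue
        if events and i - events[-1][1] <= max_gap_frames:
            events[-1][1] = i
        else:
            events.append([i, i])
    return [tuple(e) for e in events]

def keep_event(angles_deg, event, core_displacement_mm,
               min_range_deg=2.86, min_displacement_mm=0.099):
    """Reject events with a small angle range or a small core displacement."""
    start, end = event
    segment = angles_deg[start:end + 1]
    return (max(segment) - min(segment) > min_range_deg
            and core_displacement_mm > min_displacement_mm)

# Hypothetical trace: rest, four frames of alternating tail bends, rest.
angles = [0.0] * 20 + [10.0, -10.0, 8.0, -8.0] + [0.0] * 20
events = group_events(active_frames(angles))
```

In this toy trace the four bending frames form a single event, which is then kept or rejected depending on the displacement of the core supplied to `keep_event`.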

Our tracking method was robust in these experimental conditions. 
We did not probe the impact of a reduction of contrast 
or spatial resolution. All numerical thresholds used above for 
tracking were fixed empirically, but they could easily be modified 
for other users to adapt to other recording conditions. 

CALCULATION OF THE CURVATURE 

After alignment of the body axis with the y-axis in a consistent 
orientation, the tail was represented parametrically in Cartesian 
coordinates as [x(t), y(t)]. The midline of the tail was fitted to 
the x(t), y(t) function with a spline. Curvature was calculated in 
Cartesian coordinates: 

$$c = \frac{|x'y'' - y'x''|}{\left(x'^2 + y'^2\right)^{3/2}}$$

where the derivatives were all calculated with respect to t, the 
distance along the tail. 
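
As a numerical illustration of the curvature formula for a parametric curve, the sketch below uses central finite differences on sampled midline points instead of the spline fit used in ZebraZoom. The function name and the test curve are assumptions. A circle of radius 2 is used because its curvature is known to be 0.5 everywhere.

```python
import math

def curvature(xs, ys):
    """Curvature |x'y'' - y'x''| / (x'^2 + y'^2)^(3/2) at interior sample points."""
    result = []
    for i in range(1, len(xs) - 1):
        dx = (xs[i + 1] - xs[i - 1]) / 2.0           # first derivatives
        dy = (ys[i + 1] - ys[i - 1]) / 2.0
        ddx = xs[i + 1] - 2.0 * xs[i] + xs[i - 1]    # second derivatives
        ddy = ys[i + 1] - 2.0 * ys[i] + ys[i - 1]
        result.append(abs(dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5)
    return result

# Hypothetical midline: an arc of a circle with radius 2, sampled every 0.01 rad.
thetas = [0.01 * k for k in range(100)]
xs = [2.0 * math.cos(t) for t in thetas]
ys = [2.0 * math.sin(t) for t in thetas]
kappa = curvature(xs, ys)
```

Because curvature does not depend on the parametrization, a unit step between samples can be used for the derivatives without rescaling.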

EXTRACTION OF GLOBAL PARAMETERS 

For all frames of a video, ZebraZoom outputs variables for 
each larva in each dish, including the position of its core, head 
axis, midline position of its tail, and tail-bending angle. For 
each detected movement, a reference number for the larva was 
extracted along with the corresponding well number, start and 
end time of the movement, and global parameters such as the 
number of oscillations, TBF, movement duration, heading direction 
range (the range of values of the heading axis for 
one movement with the heading angle reset to zero at the onset 
of movement), distance traveled, and average speed (distance traveled 
divided by movement duration). 
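
Two of these global parameters can be computed from the tracked core positions as in the sketch below. It is a hypothetical helper: the dictionary keys are invented, and distance traveled is taken here as the summed frame-to-frame path length, which is an assumption since the text does not specify path length versus net displacement.

```python
import math

def movement_summary(core_xy_mm, start_frame, end_frame, frame_rate_hz=337.0):
    """Distance traveled, duration, and average speed for one detected movement."""
    path = core_xy_mm[start_frame:end_frame + 1]
    distance = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    duration = (end_frame - start_frame) / frame_rate_hz
    return {
        "distance_mm": distance,
        "duration_s": duration,
        "average_speed_mm_per_s": distance / duration,
    }

# Hypothetical core positions (mm) over three frames.
summary = movement_summary([(0.0, 0.0), (0.3, 0.4), (0.6, 0.8)], 0, 2)
```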

AUTOMATIC MULTICLASS CATEGORIZATION 

We automatically attributed each movement detected in the video 
to one of three maneuvers: slow forward swim, routine 
turn or escape response. Our method relied on a dynamic set of 
parameters extracted from the bending angle of the tail estimated 
from the first tail bend over a limited time window (Figures 1C, 
4A). We based our categorization on the four following parame- 
ters: (1) the amplitude of the tail-bending angle (0-178 ms, bins 
of 12 ms), (2) the instantaneous frequency (0-104 ms, bins of 
7 ms), (3) the cumulative tail-bending angle calculated as the 
average angle value over time (0-178 ms, bins of 12 ms), and (4) 
the speed (0-240 ms, bins of 24 ms) (Figure 4A). The values of 
these four dynamic parameters were interpolated with a spline 
for a given time window during the movement and then used 
for categorization of every movement. PCA was first performed 
to reduce noise and dimensionality. Each movement was subsequently 
represented by the first fourteen principal components of 
the PCA out of 53 components (together representing about 
93% of the variance), to which the total duration of the movement 
was added. Multiclass categorization was implemented in two 
steps with two successive SVM classifiers with linear kernels: the first 
SVM classifier discriminated slow forward swims from turns and 
escapes, and if necessary a second SVM discriminated between 
a routine turn and an escape. We used two distinct datasets 
from WT 5-7 dpf larvae, one for learning the three maneuver 
types (five videos, n = 201 movements) and one for testing their 
recognition (three videos, n = 189 movements). 
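
The two-step structure of the categorization can be sketched as a cascade of two linear decision functions. The weights and the two toy features below are hand-picked placeholders and do not come from the trained SVMs. In the actual pipeline each movement would be described by fourteen principal components plus its duration, and the weights would be learned from the training set.

```python
def linear_decision(weights, bias, features):
    """Signed score of a linear classifier: w . x + b."""
    return sum(w * f for w, f in zip(weights, features)) + bias

def classify_movement(features, first_stage, second_stage):
    """Two-stage cascade: S vs. (T or E), then T vs. E."""
    if linear_decision(*first_stage, features) < 0:
        return "S"   # slow forward swim
    if linear_decision(*second_stage, features) < 0:
        return "T"   # routine turn
    return "E"       # escape

# Hypothetical (weights, bias) pairs standing in for the two trained classifiers.
first_stage = ([1.0, 0.0], -20.0)
second_stage = ([0.0, 1.0], -50.0)

labels = [classify_movement(f, first_stage, second_stage)
          for f in ([5.0, 10.0], [40.0, 30.0], [90.0, 80.0])]
```

The second classifier is evaluated only when the first one rejects the slow forward swim class, which mirrors the "if necessary" wording of the text.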

ESTIMATING THE RECURRENCE OF MANEUVERS 

Successions of maneuvers performed by larvae in a given dish 
were modeled as Markov chains. Out of the nine possible 
sequences of two maneuvers (S-S, S-T, S-E, T-S, T-T, T-E, E-S, 
E-T, E-E), we estimated the frequency of occurrence of each 
sequence. For a given movement classified as S, T, or E occurring 
at a given time in one dish, we calculated the transition proba- 
bility for the subsequent movement to be classified as S, T, or E. 
We calculated a weighted transition index (I) for each sequence of 
two sequential maneuvers as the ratio of the transition probability 
from the first maneuver to the second, divided by the probability 
of occurrence of the second maneuver (Table 2 for values of all 
transition indexes). When I is equal to 1, the probability of repeating 
a maneuver is equal to the probability of random occurrence 
of the maneuver (probability of random occurrence was 0.35 for 
S; 0.48 for T and 0.16 for E; Table 2). Thus the index of recurrence 
I was defined as: 

$$I(B_1, B_2) = \frac{p(x_i = B_1 \mid x_{i-1} = B_2)}{p(x_i = B_1)}$$

with B1 and B2 as two possible maneuvers (S, T, or E) and x_{i-1} 
and x_i as two successive movements. WT larvae were used to estimate 
the transition index (36,068 movements from 280 larvae 
originating from four clutches and obtained from 40 wells). To 
investigate the recurrence of maneuvers as a function of time and 
distance, we calculated I as a function of the distance separating 
the two head centers of the larvae at the onset of their respective 
movement and the time as the time interval between the onsets 
of the first and second movement. I was calculated for many different 
time and distance windows. In Figure 5, we plotted these 
indexes for the sequences S-S, T-T, and E-E. We first calculated 
the index for the same larva (Figures 5Ai-iii) and then across different 
larvae (Figures 5Bi-iii). 
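
The transition index can be estimated from a sequence of classified maneuvers as in the sketch below. It is an illustration under stated assumptions: the function name is hypothetical, the probability of occurrence of the second maneuver is estimated from the successor positions of the same sequence, and no time or distance windows are applied.

```python
from collections import Counter

def transition_index(sequence):
    """Index for each observed (first, second) pair of successive maneuvers."""
    pairs = Counter(zip(sequence, sequence[1:]))
    first_counts = Counter(sequence[:-1])
    second_counts = Counter(sequence[1:])
    n_pairs = len(sequence) - 1
    index = {}
    for (first, second), count in pairs.items():
        p_transition = count / first_counts[first]      # p(second | first)
        p_second = second_counts[second] / n_pairs      # p(second)
        index[(first, second)] = p_transition / p_second
    return index

# Hypothetical sequence in which maneuvers tend to be repeated.
index = transition_index(list("SSSSTTTT"))
```

In this toy sequence the indexes for S-S and T-T are above 1, meaning that repeats occur more often than expected from the overall frequency of each maneuver, while the index for S-T is below 1.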

STATISTICAL ANALYSIS 

The data used for Figure 2A were based on eight videos, 420 WT 
AB larvae from six different clutches between 5 and 7 dpf. All val- 
ues were given as mean ± standard error of the mean (s.e.m.) 
calculated per movement. For the pharmacology experiments 
(Figure 2B), strychnine was bath applied at 75 μM and the data 
were based on two videos of 84 WT larvae coming from two 
clutches (42 for controls and 42 for strychnine) between 6 and 
7 dpf. The data on atoh7−/− mutants in Figure 2C were generated 
using four videos, 224 larvae total originating from four clutches 
(112 atoh7−/− and 112 control siblings). All global parameters 
plotted in Figures 2B,C were calculated per larva then averaged 
across all larvae and means were given ± s.e.m. across all larvae. 
Since the distributions of global parameters were not normal, 
a standard non-parametric Wilcoxon rank sum test was used 
in MATLAB for calculating differences between conditions with 
vs. without drugs for Figure 2B and atoh7−/− vs. siblings for 
Figure 2C. The data used for Figure 4D were based on 44,688 
movements from eight videos, 448 larvae, six clutches and for 
Figure 5 from 36,068 movements from five videos, 280 WT AB 
larvae from four clutches. To test whether the maximal values of 
the transition index differed from random, we calculated 
Imax after randomly permuting maneuvers while keeping 
the larva identity, time, and location the same for 50 iterations. 
For each comparison, S-S, T-T, E-E across different larvae 
or within the same larva, we compared the values of Imax after 
randomization to the measured value Imax using a two-sample 
t-test. 
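
The randomization procedure can be sketched as a label shuffle repeated 50 times. The example below is a simplification: it computes a single repeat index for one maneuver rather than the maximum over time and distance windows, it uses a fixed random seed for reproducibility, and it stops at building the null values without running the t-test. All names are hypothetical.

```python
import random
from statistics import mean

def repeat_index(sequence, maneuver):
    """Index for repeating a maneuver: p(next = m | previous = m) / p(next = m)."""
    pairs = list(zip(sequence, sequence[1:]))
    n_prev = sum(1 for prev, _ in pairs if prev == maneuver)
    n_next = sum(1 for _, nxt in pairs if nxt == maneuver)
    n_both = sum(1 for prev, nxt in pairs if prev == nxt == maneuver)
    if n_prev == 0 or n_next == 0:
        return 0.0
    return (n_both / n_prev) / (n_next / len(pairs))

def shuffled_indices(sequence, maneuver, iterations=50, seed=0):
    """Repeat index after shuffling maneuver labels, one value per iteration."""
    rng = random.Random(seed)
    labels = list(sequence)
    values = []
    for _ in range(iterations):
        rng.shuffle(labels)   # positions (time, larva) stay fixed, labels move
        values.append(repeat_index(labels, maneuver))
    return values

# Hypothetical, strongly clustered sequence of maneuvers.
observed_sequence = list("S" * 20 + "T" * 20)
observed = repeat_index(observed_sequence, "S")
null_values = shuffled_indices(observed_sequence, "S")
```

The observed index can then be compared with the distribution of `null_values`, for instance with the two-sample test mentioned in the text.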

DATA AND ALGORITHM SHARING 

The software ZebraZoom is documented and available online 
from SourceForge in the code tab 
(http://sourceforge.net/p/zebrazoom/wiki/Home/). ZebraZoom requires MATLAB and 
works reliably on an Ubuntu 11.04 computer with OpenCV 
installed and MATLAB 7.10. 

Video S1 | ZebraZoom tracking of seven larvae in a dish. Acquisition 
was performed at 300 Hz, and one out of every ten images is displayed 
(every 33.3 ms). 